
bigmemory (version 3.6)

write.big.matrix, read.big.matrix: File interface for a "big.matrix"

Description

Create a big.matrix by reading from a suitably-formatted ASCII file, or write the contents of a big.matrix to a file.

Usage

write.big.matrix(x, fileName = NA, row.names = FALSE, col.names = FALSE, sep=',')
read.big.matrix(fileName, sep = ',', header = FALSE, row.names = NULL, col.names = NULL,
        type = NA, skip = 0, separated = FALSE, shared = FALSE,
        backingfile = NULL, backingpath = NULL, preserve = TRUE)

Arguments

x
a big.matrix.
fileName
the name of the input or output file.
sep
a field delimiter.
header
if TRUE, the first line (after a possible skip) should contain column names.
row.names
if TRUE, use the first column of the file for row names; if a vector of names, use them even if row names appear to exist in the file.
col.names
if TRUE, use the first row of the file for column names; if a vector of names, use them even if column names exist in the file.
type
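the atomic type of the resulting big.matrix, "integer" for example; preferably specified explicitly rather than left to be guessed (see Details).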
skip
number of lines to skip at the head of the file.
separated
use separated column organization of the data instead of column-major organization.
shared
if TRUE, load the object into shared memory.
backingfile
the root name for the file(s) for the cache of x.
backingpath
the path to the directory containing the file backing cache.
preserve
if TRUE (the default), a filebacked big.matrix is preserved even after the end of the R session; set this option to FALSE if it should not be preserved.
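
A minimal sketch of how the header, sep and type arguments fit together in a write/read round trip. The matrix contents, the column names "a" and "b", and the temporary file name are illustrative assumptions, not part of the package. The file name is passed positionally because argument names may differ between bigmemory versions.

```r
library(bigmemory)

# Small integer matrix with (hypothetical) column names "a" and "b".
m <- matrix(1:6, nrow = 3, ncol = 2, dimnames = list(NULL, c("a", "b")))
x <- as.big.matrix(m, type = "integer")

# Write the column names as a header line; the file lives in tempdir().
f <- file.path(tempdir(), "roundtrip.txt")
write.big.matrix(x, f, col.names = TRUE, sep = ",")

# Read it back, stating the separator and type explicitly so that
# nothing has to be guessed; header = TRUE consumes the first line.
y <- read.big.matrix(f, sep = ",", header = TRUE, type = "integer")
y[,]
colnames(y)
```

Stating sep and type explicitly avoids the guessing described under Details.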

Value

read.big.matrix returns a big.matrix object; write.big.matrix creates an output file in the present working directory.

Details

Currently, a file must contain only one atomic type (all integer, for example). Once (if) we implement something like big.data.frame, this assumption will be relaxed. We have other ideas for useful options as well, including the reading and writing of subsets of columns.

When reading from a file, if type is not specified we try to make a reasonable guess, without making any guarantees at this point. The same is true for the field separator. Warning messages are printed to alert you when a guess is made.

Unless you have really large integer values, we strongly recommend you consider "short". If your data are essentially categorical, you might even be able to use "char", with huge memory savings in large data sets.
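
The advice above can be sketched in base R as a pre-check run before calling read.big.matrix. The helper suggest_type() is hypothetical and not part of bigmemory. It assumes that "char", "short", "integer" and "double" occupy 1, 2, 4 and 8 bytes per element, and that the most negative value of each integer type is reserved for NA, so the limits used are deliberately conservative. It inspects only a sample of rows, so its answer is a suggestion, not a guarantee.

```r
# Hypothetical helper (not part of bigmemory): inspect a sample of a
# delimited file and suggest the smallest storage type that fits.
suggest_type <- function(fileName, sep = ",", header = FALSE, nrows = 1000) {
  smp <- read.table(fileName, sep = sep, header = header, nrows = nrows)

  # The file must hold one atomic type: every column must be numeric
  # (an all-NA column is read as logical and is tolerated here).
  ok <- vapply(smp, function(col) is.numeric(col) || all(is.na(col)),
               logical(1))
  if (!all(ok)) {
    stop("non-numeric column found; the file must contain one atomic type")
  }

  vals <- unlist(smp, use.names = FALSE)
  vals <- vals[!is.na(vals)]            # drops both NA and NaN

  # Infinite or fractional values need "double".
  if (any(!is.finite(vals)) || any(vals != round(vals))) {
    return("double")
  }

  # Assumed conservative limits for 1-, 2- and 4-byte signed integers.
  m <- max(abs(vals), 0)
  if (m <= 127) {
    "char"
  } else if (m <= 32767) {
    "short"
  } else if (m <= .Machine$integer.max) {
    "integer"
  } else {
    "double"
  }
}

# Assumed bytes per element, used for a rough memory estimate.
bytes_per_element <- c(char = 1, short = 2, integer = 4, double = 8)

# Estimated size in megabytes of a 1e6 x 10 matrix for each type.
est_mb <- 1e6 * 10 * bytes_per_element / 1e6

# Usage on a small temporary file written with base R.
tmp <- tempfile(fileext = ".txt")
write.table(matrix(c(1:9, NA), 5, 2), tmp, sep = ",",
            row.names = FALSE, col.names = FALSE)
suggested <- suggest_type(tmp)
```

The suggested value can then be passed as the type argument of read.big.matrix. Under the assumed element sizes, the estimate shows "short" needing half the memory of "integer", and "char" a quarter.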

See Also

big.matrix

Examples

library(bigmemory)

# Without specifying the type, this big.matrix x will hold integers.
x <- as.big.matrix(matrix(1:10, 5, 2))
x[2,2] <- NA
x[,]
write.big.matrix(x, "foo.txt")

# Just for fun, I'll read it back in as character (1-byte integers):
y <- read.big.matrix("foo.txt", type="char")
y[,]

# Other examples:
w <- as.big.matrix(matrix(1:10, 5, 2), type='double')
w[1,2] <- NA
w[2,2] <- -Inf
w[3,2] <- Inf
w[4,2] <- NaN
w[,]
write.big.matrix(w, "bar.txt")
w <- read.big.matrix("bar.txt", type="double")
w[,]
# The same file read as short; Inf, -Inf and NaN have no short representation.
w <- read.big.matrix("bar.txt", type="short")
w[,]
